
    Improving the accuracy of automatic facial expression recognition in speaking subjects with deep learning

    When automatic facial expression recognition is applied to video sequences of speaking subjects, the recognition accuracy has been noted to be lower than with video sequences of still subjects. This effect, known as the speaking effect, arises during spontaneous conversations, where the speech articulation process influences facial configurations alongside the affective expressions. In this work we ask whether cues related to the articulation process, provided as input to a deep neural network model alongside facial features, would increase emotion recognition accuracy. We develop two neural networks that classify facial expressions in speaking subjects from the RAVDESS dataset: a spatio-temporal CNN and a GRU-cell RNN. They are first trained on facial features only, and afterwards on both facial features and articulation-related cues extracted from a model trained for lip reading, while also varying the number of consecutive frames provided as input. We show that adding articulation-related features increases the DNNs' classification accuracy by up to 12%, the increase being greater when more consecutive frames are provided as input to the model.
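    As a minimal sketch of the two-stream idea, assuming PyTorch: per-frame facial features and articulation-related features (e.g. embeddings taken from a lip-reading model) are concatenated frame by frame and fed to a GRU classifier. All names and dimensions below are illustrative, not the authors' exact architecture.

        import torch
        import torch.nn as nn

        class EmotionGRU(nn.Module):
            """Classifies emotion from a sequence of per-frame feature vectors."""
            def __init__(self, face_dim=128, artic_dim=64, hidden=256, n_classes=8):
                super().__init__()
                self.gru = nn.GRU(face_dim + artic_dim, hidden, batch_first=True)
                self.head = nn.Linear(hidden, n_classes)

            def forward(self, face_feats, artic_feats):
                # face_feats:  (batch, frames, face_dim)
                # artic_feats: (batch, frames, artic_dim), from a lip-reading model
                x = torch.cat([face_feats, artic_feats], dim=-1)
                _, h = self.gru(x)          # h: (1, batch, hidden), last hidden state
                return self.head(h[-1])     # class logits

        # Usage with random stand-in features for four 30-frame clips
        # (RAVDESS has 8 emotion classes):
        model = EmotionGRU()
        logits = model(torch.randn(4, 30, 128), torch.randn(4, 30, 64))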

    Robust Face Recognition Providing the Identity and its Reliability Degree Combining Sparse Representation and Multiple Features

    For decades, face recognition (FR) has attracted a lot of attention, and several systems have been successfully developed to solve this problem. However, the issue deserves further research effort so as to reduce the gap that still exists between computer and human ability in solving it. Among these human skills is the ability to naturally assign a "degree of reliability" to the face identification one carries out. We believe that providing an FR system with this feature would be of great help in real application contexts, making the identification process more flexible and tractable. In this spirit, we propose a completely automatic FR system, robust to adverse illumination and facial expression variations, that provides the identity together with its corresponding degree of reliability. The method promotes sparse coding of multi-feature representations with LDA projections for dimensionality reduction, and uses a multistage classifier. The method has been evaluated in the challenging condition of having few (3–5) images per subject in the gallery. Extended experiments on several challenging databases (frontal faces of Extended YaleB, BANCA, FRGC v2.0, and frontal faces of Multi-PIE) show that our method outperforms several state-of-the-art sparse coding FR systems, thus demonstrating its effectiveness and generalizability.
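    The abstract does not give the exact reliability measure; the sketch below is one plausible instantiation, using scikit-learn's orthogonal matching pursuit as a generic sparse solver and the relative margin between the two smallest class-wise reconstruction residuals as the reliability degree. All names are illustrative, not the paper's.

        import numpy as np
        from sklearn.linear_model import OrthogonalMatchingPursuit

        def classify_with_reliability(D, labels, y, n_nonzero=10):
            """D: (d, n) dictionary of gallery features, one column per image;
            labels: (n,) subject id per column; y: (d,) probe feature."""
            omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero)
            x = omp.fit(D, y).coef_                    # sparse code of the probe
            residuals = {}
            for c in np.unique(labels):
                xc = np.where(labels == c, x, 0.0)     # keep only class-c coefficients
                residuals[c] = np.linalg.norm(y - D @ xc)
            ranked = sorted(residuals, key=residuals.get)
            best, second = ranked[0], ranked[1]
            # Reliability: relative gap between best and runner-up residuals, in [0, 1).
            reliability = 1.0 - residuals[best] / residuals[second]
            return best, reliability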

    Sparse Representation Based Classification for Face Recognition by k-LiMapS Algorithm

    In this paper, we present a new approach for face recognition that is robust against both poorly defined and poorly aligned training and testing data, even with few training samples. Working in the conventional feature space yielded by Fisher's Linear Discriminant analysis, it uses a recent algorithm for sparse representation, namely k-LiMapS, as the general classification criterion. This technique performs a local ℓ0 pseudo-norm minimization by iterating suitable parametric nonlinear mappings. Thanks to its particular search strategy, it is very fast and able to discriminate among separated classes lying in the low-dimensional Fisherspace. Experiments carried out on the FRGC version 2.0 database show good classification capability, even when compared with the state-of-the-art ℓ1-norm-based sparse representation classifier (SRC).
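    k-LiMapS has no standard library implementation, so the sketch below keeps the pipeline the abstract describes (Fisherspace projection, sparse coding over the training matrix, minimum class-wise residual classification) but substitutes scikit-learn's OMP for the k-LiMapS solver; it is a sketch of the scheme, not the paper's algorithm.

        import numpy as np
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
        from sklearn.linear_model import OrthogonalMatchingPursuit

        def src_in_fisherspace(X_train, y_train, x_test, k=5):
            """X_train: (n, d) features; y_train: (n,) labels; x_test: (d,)."""
            lda = LinearDiscriminantAnalysis()
            A = lda.fit_transform(X_train, y_train).T   # (c-1, n) dictionary in Fisherspace
            b = lda.transform(x_test[None, :]).ravel()  # projected probe
            code = OrthogonalMatchingPursuit(n_nonzero_coefs=k).fit(A, b).coef_
            # Assign the probe to the class whose atoms best reconstruct it.
            residual = lambda c: np.linalg.norm(b - A @ np.where(y_train == c, code, 0.0))
            return min(np.unique(y_train), key=residual)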

    High-rate compression of ECG signals by an accuracy-driven sparsity model relying on natural basis

    Long-duration recordings of ECG signals require high compression ratios, in particular when stored on portable devices. Most ECG compression methods in the literature are based on the wavelet transform, while only a few rely on sparsity promotion models. In this paper we propose a novel ECG signal compression framework based on sparse representation using a set of ECG segments as a natural basis. This approach exploits the signal regularity, i.e. the repetition of common patterns, in order to achieve a high compression ratio (CR). We apply k-LiMapS as a fine-tuned sparsity solver algorithm, guaranteeing the required signal reconstruction quality in terms of PRDN (Normalized Percentage Root-mean-square Difference). Extensive experiments have been conducted on all 48 records of the MIT-BIH Arrhythmia Database and on some 24-hour records from the Long-Term ST Database. Direct comparisons of our method with several state-of-the-art ECG compression methods (namely ARLE, Rajoub's, SPIHT, and TRE) prove its effectiveness. Our method achieves average performances that are two to three times higher than those obtained by the other assessed methods. In particular, the compression ratio gap between our method and the others increases with growing PRDN.
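    A minimal sketch of the natural-basis idea, again with OMP standing in for k-LiMapS: previously seen ECG segments form the dictionary, a new segment is stored as a few (index, coefficient) pairs, and PRDN = 100 · ||x − x̂|| / ||x − mean(x)|| measures reconstruction quality. Segment length and sparsity level are illustrative.

        import numpy as np
        from sklearn.linear_model import OrthogonalMatchingPursuit

        def compress_segment(D, x, n_nonzero=8):
            """D: (L, m) dictionary whose columns are past ECG segments; x: (L,)."""
            code = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero).fit(D, x).coef_
            idx = np.flatnonzero(code)          # store only these (index, value) pairs
            return idx, code[idx]

        def prdn(x, x_hat):
            """Normalized Percentage Root-mean-square Difference."""
            return 100.0 * np.linalg.norm(x - x_hat) / np.linalg.norm(x - x.mean())

        # Usage: x_hat = D[:, idx] @ coeffs; if prdn(x, x_hat) exceeds the required
        # quality threshold, increase n_nonzero (at the cost of a lower CR).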

    Precise eye localization through a general-to-specific model definition

    We present a method for precise eye localization that uses two Support Vector Machines trained on properly selected Haar wavelet coefficients. Evaluation of our technique on many standard databases shows very good performance. Furthermore, we study the strong correlation between the eye localization error and the face recognition rate.
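    The coefficient selection is the paper's contribution and is not detailed in the abstract; the sketch below, assuming PyWavelets and scikit-learn, simply keeps all Haar coefficients up to level 2 and trains an SVM to separate eye-centered patches from non-eye patches. Names are illustrative.

        import numpy as np
        import pywt
        from sklearn.svm import SVC

        def haar_features(patch, level=2):
            """patch: 2D grayscale array. Returns flattened Haar wavelet coefficients."""
            coeffs = pywt.wavedec2(patch, 'haar', level=level)
            arr, _ = pywt.coeffs_to_array(coeffs)
            return arr.ravel()

        def train_eye_detector(patches, y):
            """patches: equally sized 2D arrays; y[i] = 1 for eye-centered patches."""
            X = np.stack([haar_features(p) for p in patches])
            return SVC(kernel='rbf').fit(X, y)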

    Face recognition in uncontrolled conditions using sparse representation and local features

    Face recognition in the presence of occlusions, illumination changes, or large expression variations is still an open problem. This paper addresses this issue by presenting a new local-based face recognition system that combines weak classifiers into a strong one. The method relies on sparse approximation using dictionaries built on a pool of local features extracted from automatically cropped images. Experiments on the AR database show the effectiveness of our method, which outperforms current state-of-the-art techniques.
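    A hedged sketch of the weak-to-strong combination, assuming aligned face images and any per-patch classifier (e.g. a sparse-representation classifier over a local-feature dictionary) that returns a subject id; majority voting is one simple aggregation rule, not necessarily the paper's exact one.

        import numpy as np
        from collections import Counter

        def grid_patches(img, rows=4, cols=4):
            """Split an aligned face image into a rows x cols grid of local patches."""
            h, w = img.shape[0] // rows, img.shape[1] // cols
            return [img[i*h:(i+1)*h, j*w:(j+1)*w]
                    for i in range(rows) for j in range(cols)]

        def combine_local_classifiers(patch_preds):
            """patch_preds: one predicted subject id per weak local classifier.
            The majority vote turns the weak per-patch decisions into a strong one."""
            return Counter(patch_preds).most_common(1)[0][0]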

    Orthogonal procrustes analysis for dictionary learning in sparse linear representation

    In the sparse representation model, the design of overcomplete dictionaries plays a key role in effectiveness and applicability across different domains. Recent research has produced several dictionary learning approaches, and it has been proven that dictionaries learnt from data examples significantly outperform structured ones, e.g. wavelet transforms. In this context, learning consists in adapting the dictionary atoms to a set of training signals in order to promote a sparse representation that minimizes the reconstruction error. Finding the best-fitting dictionary remains a very difficult task, leaving the question still open. A well-established heuristic for tackling this problem is an iterative alternating scheme, adopted for instance in the well-known K-SVD algorithm. Essentially, it consists of repeating two stages: the former promotes sparse coding of the training set and the latter adapts the dictionary to reduce the error. In this paper we present R-SVD, a new method that, while maintaining the alternating scheme, adopts Orthogonal Procrustes analysis to update the dictionary atoms, suitably arranged into groups. Comparative experiments on synthetic data prove the effectiveness of R-SVD with respect to well-known dictionary learning algorithms such as K-SVD, ILS-DLA, and the online method OSDL. Moreover, experiments on natural data such as ECG compression, EEG sparse representation, and image modeling confirm R-SVD's robustness and wide applicability.
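    A simplified sketch of one such alternating iteration, with scipy's orthogonal_procrustes solving the group rotation in closed form (via SVD) and OMP standing in for the sparse coding stage; the grouping strategy here is a single random group, whereas the paper's atom arrangement is part of its contribution.

        import numpy as np
        from scipy.linalg import orthogonal_procrustes
        from sklearn.linear_model import OrthogonalMatchingPursuit

        def rsvd_like_step(X, D, k=5, group_size=8):
            """X: (d, N) training signals; D: (d, m) dictionary, unit-norm atoms."""
            # Stage 1: sparse coding of the training set over the current dictionary.
            S = np.stack([OrthogonalMatchingPursuit(n_nonzero_coefs=k).fit(D, x).coef_
                          for x in X.T], axis=1)            # (m, N) codes
            # Stage 2: rotate one group of atoms to better fit its residual.
            G = np.random.choice(D.shape[1], group_size, replace=False)
            rest = np.setdiff1d(np.arange(D.shape[1]), G)
            E = X - D[:, rest] @ S[rest]                    # residual left to group G
            M = D[:, G] @ S[G]                              # group's current contribution
            # Orthogonal Q minimizing ||Q M - E||_F (Procrustes solution via SVD).
            Q = orthogonal_procrustes(M.T, E.T)[0].T
            D[:, G] = Q @ D[:, G]
            return D, S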

    Robust single-sample face recognition by sparsity-driven sub-dictionary learning using deep features

    Face recognition using a single reference image per subject is challenging, above all when referring to a large gallery of subjects. Furthermore, the problem becomes considerably harder when the images are acquired in unconstrained conditions. In this paper we address the challenging Single Sample Per Person (SSPP) problem, considering large datasets of images acquired in the wild, thus possibly featuring illumination, pose, facial expression, partial occlusion, and low-resolution hurdles. The proposed technique alternates a sparse dictionary learning technique based on the Method of Optimal Directions with the iterative ℓ0-norm minimization algorithm called k-LIMAPS. It works on robust deep-learned features, provided that the image variability is extended by standard augmentation techniques. Experiments show the effectiveness of our method against the difficulties introduced above: first, we report extensive experiments on the unconstrained LFW dataset with large galleries of up to 1680 subjects; second, we present experiments on very low-resolution test images up to 8 × 8 pixels; third, tests on the AR dataset are analyzed against specific disguises such as partial occlusions, facial expressions, and illumination problems. In all three scenarios our method outperforms state-of-the-art approaches adopting similar configurations.
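    A compact sketch of the alternation the abstract describes, with illustrative stand-ins: any pretrained CNN embedder can supply the deep features of the augmented gallery images, OMP substitutes the k-LIMAPS solver, and the Method of Optimal Directions (MOD) update D = X A⁺ is the closed-form least-squares dictionary refit. The dictionary size m must not exceed the number of augmented images.

        import numpy as np
        from sklearn.linear_model import OrthogonalMatchingPursuit

        def mod_dictionary_learning(X, m=32, k=4, n_iter=10, seed=0):
            """X: (d, N) deep features of a subject's augmented gallery images, N >= m."""
            rng = np.random.default_rng(seed)
            D = X[:, rng.choice(X.shape[1], m, replace=False)].astype(float)
            D /= np.linalg.norm(D, axis=0)                  # unit-norm atoms
            for _ in range(n_iter):
                # Sparse coding step (OMP stands in for the paper's k-LIMAPS solver).
                A = np.stack([OrthogonalMatchingPursuit(n_nonzero_coefs=k).fit(D, x).coef_
                              for x in X.T], axis=1)        # (m, N) codes
                # MOD step: closed-form least-squares dictionary update D = X A^+.
                D = X @ np.linalg.pinv(A)
                D /= np.linalg.norm(D, axis=0) + 1e-12
            return D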

    Fiducial point localization in color images of face foregrounds

    We describe a method for the automatic identification of facial features (eyes, nose, mouth, and chin) and the precise localization of their fiducial points (e.g. nose tip, mouth and eye corners) in color images of face foregrounds. The algorithm takes as input 2D color images representing face foregrounds with a homogeneous background; it is scale-independent, it deals with frontal, rotated (up to 30°), or slightly tilted (up to 10°) faces, and it is robust to different facial expressions, provided the mouth is closed, the eyes are open, and no glasses are worn. The method proceeds by successive refinements: first, it identifies the sub-images containing each feature; afterwards, it processes the single features separately with a blend of techniques that use both color and shape information. The system has been tested on three databases: the XM2VTS database, the University of Stirling database, and our own, for a total of 1650 images. The obtained results are reported quantitatively and discussed.
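    The abstract does not detail the color/shape blend; as one illustrative example of a color cue usable in the refinement stage, the sketch below highlights lip pixels by their red-over-green dominance within a mouth sub-image (a common heuristic, not necessarily the paper's technique).

        import numpy as np

        def lip_likelihood(rgb_patch):
            """rgb_patch: (h, w, 3) float array in [0, 1], cropped around the mouth.
            Returns a map that is high where red dominates green, as on lips."""
            r, g = rgb_patch[..., 0], rgb_patch[..., 1]
            m = r / (g + 1e-6)                  # red-to-green ratio
            return (m - m.min()) / (np.ptp(m) + 1e-12)

        # Fiducials such as the mouth corners can then be taken from the extremal
        # columns of the thresholded map, refined with shape constraints.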